We model Alzheimer's disease (AD) progression by combining differential equations (DEs) and reinforcement learning (RL) with domain knowledge. The DEs provide relationships between some, but not all, factors relevant to AD. We hypothesize that the missing relationships must satisfy general criteria about the working of the brain, for example, maximizing cognition while minimizing the cost of supporting cognition. This allows us to extract the missing relationships by using RL to optimize an objective (reward) function that captures these criteria. We use our model, consisting of the DEs (as a simulator) and the trained RL agent, to predict individualized 10-year AD progression from baseline (year 0) features on synthetic and real data. The model predicts 10-year cognition trajectories comparably to, or better than, state-of-the-art learning-based models. Our interpretable model also demonstrates, and provides insights into, "recovery/compensatory" processes that mitigate the effects of AD, even though such processes are not explicitly encoded in the model. Our framework combines DEs with RL to model AD progression and is broadly applicable to understanding other neurological disorders.
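As a rough illustration of this framework (the dynamics, coefficients, and the random-search stand-in for the RL agent are all assumptions, not the paper's actual equations or algorithm), the sketch below uses a toy differential-equation simulator as the environment and searches for a missing "compensation" relationship by maximizing a reward that trades cognition against its support cost.

```python
import numpy as np

def simulate(policy_gain, years=10, dt=0.1):
    """Euler-integrate a toy two-variable DE: pathology grows, cognition declines,
    and a 'compensation' signal u (the missing relationship) slows the decline
    at a quadratic cost."""
    pathology, cognition, reward = 0.1, 1.0, 0.0
    for _ in range(int(years / dt)):
        u = policy_gain * pathology                 # hypothetical learned relationship
        d_path = 0.05 * pathology                   # pathology accumulates
        d_cog = -0.2 * pathology + 0.15 * u         # compensation mitigates decline
        pathology += dt * d_path
        cognition += dt * d_cog
        reward += dt * (cognition - 2.0 * u ** 2)   # maximize cognition, minimize cost
    return reward

# Crude stand-in for the trained RL agent: random search over the policy parameter.
rng = np.random.default_rng(0)
candidates = rng.uniform(0.0, 2.0, size=200)
best = max(candidates, key=simulate)
print(f"best compensation gain: {best:.2f}, reward: {simulate(best):.2f}")
```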
Applying deep learning concepts from image detection and graph theory has greatly advanced protein-ligand binding affinity prediction, a challenge with enormous ramifications for both drug discovery and protein engineering. We build upon these advances by designing a novel deep learning architecture consisting of a 3-dimensional convolutional neural network utilizing channel-wise attention and two graph convolutional networks utilizing attention-based aggregation of node features. HAC-Net (Hybrid Attention-Based Convolutional Neural Network) obtains state-of-the-art results on the PDBbind v.2016 core set, the most widely recognized benchmark in the field. We extensively assess the generalizability of our model using multiple train-test splits, each of which maximizes differences between either protein structures, protein sequences, or ligand extended-connectivity fingerprints. Furthermore, we perform 10-fold cross-validation with a similarity cutoff between SMILES strings of ligands in the training and test sets, and also evaluate the performance of HAC-Net on lower-quality data. We envision that this model can be extended to a broad range of supervised learning problems related to structure-based biomolecular property prediction. All of our software is available as open source at https://github.com/gregory-kyro/HAC-Net/.
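A minimal sketch of the channel-wise attention idea applied to 3D convolutional features, in the squeeze-and-excitation style; this is not the released HAC-Net code, and the channel counts and reduction ratio are placeholders.

```python
import torch
import torch.nn as nn

class ChannelAttention3D(nn.Module):
    """Squeeze-and-excitation-style channel attention for 3D feature maps."""
    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool3d(1)          # squeeze the spatial dimensions
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),                            # per-channel weights in (0, 1)
        )

    def forward(self, x):                            # x: (B, C, D, H, W)
        b, c = x.shape[:2]
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1, 1)
        return x * w                                 # re-weight the channels

block = nn.Sequential(
    nn.Conv3d(19, 32, kernel_size=3, padding=1),     # 19 voxel feature channels (placeholder)
    nn.ReLU(inplace=True),
    ChannelAttention3D(32),
)
print(block(torch.randn(2, 19, 24, 24, 24)).shape)   # torch.Size([2, 32, 24, 24, 24])
```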
It is known that neural networks have the problem of being over-confident when directly using the output label distribution to generate uncertainty measures. Existing methods mainly resolve this issue by retraining the entire model to impose the uncertainty quantification capability so that the learned model can achieve desired performance in accuracy and uncertainty prediction simultaneously. However, training the model from scratch is computationally expensive and may not be feasible in many situations. In this work, we consider a more practical post-hoc uncertainty learning setting, where a well-trained base model is given, and we focus on the uncertainty quantification task at the second stage of training. We propose a novel Bayesian meta-model to augment pre-trained models with better uncertainty quantification abilities, which is effective and computationally efficient. Our proposed method requires no additional training data and is flexible enough to quantify different uncertainties and easily adapt to different application settings, including out-of-domain data detection, misclassification detection, and trustworthy transfer learning. We demonstrate our proposed meta-model approach's flexibility and superior empirical performance on these applications over multiple representative image classification benchmarks.
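The sketch below illustrates the general post-hoc setting under stated assumptions (a frozen stand-in base network and a Monte Carlo dropout head rather than the paper's exact Bayesian meta-model): a small stochastic head on frozen features is sampled at test time to produce an uncertainty score alongside the prediction.

```python
import torch
import torch.nn as nn

# Frozen stand-in for the well-trained base model (only its features are used).
base = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 128), nn.ReLU())
for p in base.parameters():
    p.requires_grad = False

# Small trainable meta-head; dropout gives it a cheap stochastic (approximately
# Bayesian) character without retraining the base model.
meta_head = nn.Sequential(nn.Dropout(p=0.3), nn.Linear(128, 10))

def predict_with_uncertainty(x, samples: int = 20):
    """Average multiple stochastic forward passes through the meta-head; the spread
    across samples serves as the uncertainty score."""
    meta_head.train()                                # keep dropout active at test time
    with torch.no_grad():
        feats = base(x)
        probs = torch.stack(
            [torch.softmax(meta_head(feats), dim=-1) for _ in range(samples)]
        )
    return probs.mean(0), probs.var(0).sum(-1)       # prediction, uncertainty

mean_probs, uncertainty = predict_with_uncertainty(torch.randn(4, 1, 28, 28))
print(mean_probs.shape, uncertainty.shape)
```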
Social media platforms allow users to freely share their opinions about issues or anything they feel like. However, they also make it easier to spread hate and abusive content. The Fulani ethnic group has been the victim of this unfortunate phenomenon. This paper introduces HERDPhobia, the first annotated hate speech dataset on Fulani herders in Nigeria, in three languages: English, Nigerian-Pidgin, and Hausa. We present a benchmark experiment using pre-trained language models to classify the tweets as either hateful or non-hateful. Our experiment shows that the XLM-T model provides better performance, with a weighted F1 of 99.83%. We release the dataset at https://github.com/hausanlp/HERDPhobia for further research.
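A minimal sketch of the classification setup, not the paper's training script; the checkpoint name is an assumed XLM-T-style backbone, and the classification head is untrained until fine-tuned on HERDPhobia.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

checkpoint = "cardiffnlp/twitter-xlm-roberta-base"   # assumed XLM-T-style backbone
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
# The 2-way classification head is freshly initialized here; it only becomes
# meaningful after fine-tuning on the HERDPhobia training split.
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

inputs = tokenizer("example tweet text", return_tensors="pt", truncation=True)
with torch.no_grad():
    probs = torch.softmax(model(**inputs).logits, dim=-1)
print({"non-hateful": probs[0, 0].item(), "hateful": probs[0, 1].item()})
```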
Few-shot learning is a rapidly evolving area of research in machine learning where the goal is to classify unlabeled data with only one or "a few" labeled exemplary samples. Neural networks are typically trained to minimize a distance metric between labeled exemplary samples and a query set. Early few-shot approaches use an episodic training process to sub-sample the training data into few-shot batches. This training process matches the sub-sampling done on evaluation. Recently, conventional supervised training coupled with a cosine distance has achieved superior performance for few-shot learning. Despite the diversity of few-shot approaches over the past decade, most methods still rely on a cosine or Euclidean distance layer between the latent features of the trained network. In this work, we investigate the distributions of trained few-shot features and demonstrate that they can be roughly approximated as exponential distributions. Under this assumption of an exponential distribution, we propose a new maximum log-likelihood metric for few-shot architectures. We demonstrate that the proposed metric achieves superior accuracy compared to conventional similarity metrics (e.g., cosine, Euclidean, etc.) and achieves state-of-the-art inductive few-shot performance. Further, additional gains can be achieved by carefully combining multiple metrics, and neither of our methods requires post-processing feature transformations, which are common to many algorithms. Finally, we demonstrate a novel iterative algorithm designed around our maximum log-likelihood approach that achieves state-of-the-art transductive few-shot performance when the evaluation data is imbalanced. We have made our code publicly available at https://github.com/samuelhess/MLL_FSL/.
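The following is a hedged sketch of one possible reading of the exponential-likelihood idea, not the authors' exact formulation: per-dimension exponential rates are fitted from each class's support features, and a query is assigned to the class under which its log-likelihood is highest.

```python
import numpy as np

def exp_log_likelihood(query, support):
    """query: (d,) non-negative features; support: (n, d) features for one class.
    Fit an exponential rate per dimension from the support mean, then return
    sum_j log( rate_j * exp(-rate_j * query_j) )."""
    eps = 1e-8
    rates = 1.0 / (support.mean(axis=0) + eps)        # MLE of the exponential rate
    return np.sum(np.log(rates + eps) - rates * query)

rng = np.random.default_rng(0)
# Synthetic 3-way, 5-shot episode with 64-dimensional non-negative features.
support_sets = {c: rng.exponential(scale=c + 1.0, size=(5, 64)) for c in range(3)}
query = rng.exponential(scale=2.0, size=64)           # generated to match class 1
scores = {c: exp_log_likelihood(query, s) for c, s in support_sets.items()}
print(max(scores, key=scores.get), scores)
```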
Imaging exams, such as chest radiography, produce a small set of common findings and a much larger set of rare findings. While a trained radiologist can learn the visual presentation of rare conditions by studying a few representative examples, teaching a machine to learn from such a "long-tailed" distribution is much harder, as standard methods are easily biased toward the most frequent classes. In this paper, we present a comprehensive benchmark study of the long-tailed learning problem in the specific domain of thoracic disease classification from chest X-rays. We focus on learning from naturally distributed chest X-ray data, optimizing classification accuracy not only for the common "head" classes but also for the rare yet critical "tail" classes. To this end, we introduce a challenging new long-tailed chest X-ray benchmark to facilitate the development of long-tailed learning methods for medical image classification. The benchmark consists of two chest X-ray datasets for 19- and 20-way thoracic disease classification, containing classes with as many as 53,000 and as few as 7 labeled training images. We evaluate standard and state-of-the-art long-tailed learning methods on this new benchmark, analyze which aspects of these methods are most beneficial for long-tailed medical image classification, and summarize insights for future algorithm design. The datasets, trained models, and code are available at https://github.com/vita-group/longtailcxr.
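For concreteness, the sketch below shows one standard long-tailed baseline of the kind such a benchmark would evaluate, inverse-frequency loss re-weighting; the class counts are made up and this is not the benchmark's own code.

```python
import torch
import torch.nn as nn

class_counts = torch.tensor([53000.0, 12000.0, 800.0, 45.0, 7.0])   # head to tail
weights = class_counts.sum() / (len(class_counts) * class_counts)   # inverse frequency
criterion = nn.CrossEntropyLoss(weight=weights)                     # rare classes weigh more

logits = torch.randn(8, 5)                  # stand-in model outputs, 5-way classification
labels = torch.randint(0, 5, (8,))
print(criterion(logits, labels))
```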
Robust classification is essential in tasks such as autonomous vehicle sign recognition, where the downsides of misclassification can be severe. Adversarial attacks threaten the robustness of neural network classifiers, causing them to consistently and confidently misidentify road signs. One such class of attack, shadow-based attacks, causes misidentification by applying a natural-looking shadow to an input image, producing road signs that look natural to a human observer but confuse these classifiers. Current defenses against such attacks rely on simple adversarial training procedures, achieving rather low robustness of 25% and 40% on the GTSRB and LISA test sets, respectively. In this paper, we propose a robust, fast, and generalizable method for defending against shadow attacks in the context of road sign recognition that augments source images with binary adaptive threshold and edge maps. We empirically demonstrate its robustness against shadow attacks, and reformulate the problem to show its similarity to $\varepsilon$-based perturbation attacks. Experimental results show that our edge defense achieves 78% robustness while maintaining 98% benign test accuracy on the GTSRB test set, with similar results for our threshold defense. A link to our code is provided in the paper.
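A minimal sketch of the input augmentation described above, using standard OpenCV operations; the specific threshold and edge parameters are assumptions rather than the paper's settings.

```python
import cv2
import numpy as np

def augment_with_threshold_and_edges(bgr_image: np.ndarray) -> np.ndarray:
    """Append a binary adaptive threshold map and an edge map as extra channels."""
    gray = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2GRAY)
    thresh = cv2.adaptiveThreshold(
        gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 11, 2
    )
    edges = cv2.Canny(gray, 100, 200)
    return np.dstack([bgr_image, thresh, edges])      # (H, W, 5) classifier input

sign = (np.random.rand(32, 32, 3) * 255).astype(np.uint8)   # stand-in for a GTSRB image
print(augment_with_threshold_and_edges(sign).shape)          # (32, 32, 5)
```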
In-context learning refers to the ability of a model to condition on a prompt sequence consisting of in-context examples (input-output pairs corresponding to some task) along with a new query input, and to generate the corresponding output. Crucially, in-context learning happens only at inference time, without any parameter updates to the model. While large language models such as GPT-3 exhibit some ability to perform in-context learning, it is unclear what the relationship is between the tasks on which this succeeds and what is present in the training data. To make progress toward understanding in-context learning, we consider the well-defined problem of training a model to in-context learn a function class (e.g., linear functions): that is, given data derived from some functions in the class, can we train a model to in-context learn "most" functions from this class? We show empirically that standard Transformers can be trained from scratch to perform in-context learning of linear functions; that is, the trained model is able to learn unseen linear functions from in-context examples, with performance comparable to the optimal least squares estimator. In fact, in-context learning is possible even under two forms of distribution shift: (i) between the training data of the model and inference-time prompts, and (ii) between the in-context examples and the query input during inference. We also show that we can train Transformers to in-context learn more complex function classes, namely sparse linear functions, two-layer neural networks, and decision trees, with performance that matches or exceeds that of task-specific learning algorithms. Our code and models are available at https://github.com/dtsip/in-context-learning.
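The sketch below illustrates the problem setup under simple assumptions (the dimension, prompt length, and noiseless labels are placeholders): sample a random linear function, build an in-context prompt of (x, f(x)) pairs plus a query, and compute the least squares baseline against which a trained Transformer would be compared; the Transformer itself is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_context = 20, 40
w = rng.normal(size=d)                           # an unseen linear function f(x) = w . x
xs = rng.normal(size=(n_context, d))
ys = xs @ w                                      # in-context examples (x_i, f(x_i))
x_query = rng.normal(size=d)                     # new query input

w_hat, *_ = np.linalg.lstsq(xs, ys, rcond=None)  # optimal least squares estimator
print(abs(x_query @ w_hat - x_query @ w))        # near zero once n_context >= d
```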
Single-cell RNA-seq data allow the quantification of cell type differences across a growing set of biological contexts. However, pinpointing a small subset of genomic features that explains this variability can be ill-defined and computationally intractable. Here we introduce MarkerMap, a generative model for selecting minimal gene sets that are maximally informative of cell type origin and that enable whole-transcriptome reconstruction. MarkerMap provides a scalable framework for both supervised marker selection, aimed at identifying specific cell type populations, and unsupervised marker selection, aimed at gene expression imputation and reconstruction. We benchmark MarkerMap's competitive performance against previously published approaches on real single-cell gene expression data sets. MarkerMap is available as a pip-installable package and as a community resource aimed at developing explainable machine learning techniques for enhancing interpretability in single-cell studies.
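This is not the MarkerMap API; as a hedged illustration of the underlying goal of selecting a small, maximally informative gene panel, the sketch below applies a simple mutual-information filter to synthetic expression-like data.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import mutual_info_classif

# Stand-in for a cells-by-genes expression matrix with cell-type labels.
X, cell_type = make_classification(
    n_samples=500, n_features=200, n_informative=10, random_state=0
)
mi = mutual_info_classif(X, cell_type, random_state=0)
marker_panel = np.argsort(mi)[::-1][:10]         # the 10 most informative "genes"
print(marker_panel)
```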
Direct localization (DLOC) methods, which use the observed data to localize a source at an unknown position in a one-step procedure, generally outperform their indirect two-step counterparts (e.g., methods based on time differences of arrival). However, underwater acoustic DLOC methods require prior knowledge of the environment and are computationally costly, and hence slow. We propose what is, to the best of our knowledge, the first data-driven DLOC method. Inspired by classical and contemporary optimal model-based DLOC solutions, and leveraging the capabilities of convolutional neural networks (CNNs), we design a holistic CNN-based solution. Our method includes a specifically tailored input structure, architecture, loss function, and progressive training procedure, which are of independent interest in the broader machine learning context. We demonstrate that our method outperforms attractive alternatives and asymptotically matches the performance of an oracle model-based optimal solution.
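As a generic, hedged sketch that borrows no details from the paper, the snippet below shows the basic shape of a data-driven direct-localization regressor: a small CNN mapping a multi-sensor input directly to source coordinates; the channel count and output parameterization are assumptions.

```python
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(8, 16, kernel_size=3, padding=1),  # 8 sensor channels (assumed)
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(4),
    nn.Flatten(),
    nn.Linear(16 * 4 * 4, 2),                    # e.g., (range, depth) position estimate
)
observations = torch.randn(5, 8, 64, 64)         # batch of stand-in array observations
print(model(observations).shape)                 # torch.Size([5, 2])
```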